Abstract

We consider random permutations derived by sampling from stick-breaking partitions of the unit interval. The cycle structure of such a permutation can be associated with the path of a decreasing Markov chain on $n$ integers. Under certain assumptions on the stick-breaking factor we prove a central limit theorem for the logarithm of the order of the permutation, thus extending the classical Erdős-Turán law for the uniform permutations and its generalization for Ewens' permutations associated with sampling from the PD/GEM($\theta$) distribution. Our approach is based on using perturbed random walks to obtain the limit laws for the sum of logarithms of the cycle lengths.

Keywords: random permutation, Erdős-Turán law, stick-breaking, perturbed random walk

1 Introduction

Let $\mathfrak{S}_n$ be the symmetric group on $[n] := \{1, \dots, n\}$. The order of a permutation $\sigma \in \mathfrak{S}_n$ is the smallest positive integer $k$ such that the $k$-fold composition of $\sigma$ with itself is the identity permutation. The order can be determined from the cycle representation of $\sigma$ as the least common multiple (l.c.m.) of the cycle lengths. For instance, the permutation $\sigma = (1\ 9\ 6\ 2)(3\ 7\ 5)(4\ 8)$ has order 12.
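As a small illustration of the l.c.m. rule, the following sketch computes the order of a permutation from its cycle lengths. The helper names `from_cycles`, `cycle_lengths` and `order` are hypothetical and introduced here only for the example.

```python
from functools import reduce
from math import gcd


def from_cycles(cycles):
    """Build a permutation (dict i -> sigma(i)) from a list of cycles."""
    perm = {}
    for cyc in cycles:
        # Each element maps to its right neighbour; the last maps to the first.
        for a, b in zip(cyc, cyc[1:] + cyc[:1]):
            perm[a] = b
    return perm


def cycle_lengths(perm):
    """Lengths of the cycles of a permutation given as a dict."""
    seen, lengths = set(), []
    for start in perm:
        if start in seen:
            continue
        length, i = 0, start
        while i not in seen:
            seen.add(i)
            i = perm[i]
            length += 1
        lengths.append(length)
    return lengths


def order(perm):
    """Order of the permutation: the l.c.m. of its cycle lengths."""
    return reduce(lambda a, b: a * b // gcd(a, b), cycle_lengths(perm), 1)


sigma = from_cycles([(1, 9, 6, 2), (3, 7, 5), (4, 8)])
print(order(sigma))  # l.c.m. of 4, 3 and 2
```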

A random permutation of $[n]$ is a random variable $\Pi_n$ with values in the set $\mathfrak{S}_n$. A widely known parametric family of random permutations has probability mass function

$$\mathbb{P}\{\Pi_n = \sigma\} = c^{-1}\theta^{|\sigma|}, \quad \theta > 0, \tag{1}$$

where $|\sigma|$ denotes the number of cycles, and the constant is $c = (\theta)_n := \Gamma(\theta + n)/\Gamma(\theta)$. This family is sometimes called Ewens' permutations since the collection of cycle lengths is then a random partition distributed according to the Ewens sampling formula [31 129]. The instance $\theta = 1$ corresponds to the uniform distribution under which all permutations $\sigma \in \mathfrak{S}_n$ are equally likely.

For a random permutation $\Pi_n$ with some fixed distribution let $K_{n,r}$ be the number of cycles of length $r$ and let $K_n := |\Pi_n| = \sum_{r=1}^{n} K_{n,r}$ be the total number of cycles. We call the vector $(K_{n,1}, \dots, K_{n,n})$ the cycle partition of $\Pi_n$. In terms of the cycle partition the order of $\Pi_n$ is the random variable defined as

$$O_n := \mathrm{l.c.m.}\{r \in [n] : K_{n,r} > 0\}. \tag{2}$$

In a seminal 1967 paper [2] Erdős and Turán showed that for the uniform permutation the distribution of $\log O_n$ is asymptotically normal. Arratia and Tavaré [3] extended this result to Ewens' permutations, by showing that

$$\frac{\log O_n - (\theta/2)\log^2 n}{\sqrt{(\theta/3)\log^3 n}} \overset{d}{\to} \mathcal{N}(0,1), \quad n \to \infty. \tag{3}$$

The proof in [4] (see also [3], Theorem 5.15), apparently the shortest one known, is based on the Feller coupling and asymptotic independence of the $K_{n,r}$'s.

In this paper we generalize the Erdős-Turán law to a much richer family of random permutations derived from stick-breaking partitions of the unit interval by means of a simple occupancy scheme called Kingman's 'paintbox process' [29]. The toolbox of methods suitable for the study of Ewens' permutations is no longer applicable in the wider setting due to the lack of asymptotic independence of the $K_{n,r}$'s. Instead, extending the line initiated in [101 EH 113 ESI EH [SOj [21], we apply the methods of renewal theory to obtain results on the weak convergence of the decisive quantity $\log T_n := \sum_{r=1}^{n} K_{n,r}\log r$ which approximates the logarithm of the order of the permutation. We show that the normal and other stable distributions can appear as limit laws, as determined by properties of the stick-breaking factor.

There have been many studies of random permutations that are conditionally uniform given the value of some permutation statistic [3 E2l EH E]. Our motivation to consider the class of stick-breaking models has several sources, among which are the theory of regenerative composition structures [IH], more general exchangeable partitions [22] and the logarithmic combinatorial structures [3]. The present paper is the first study of a separable statistic $\sum_r K_{n,r}h(r)$ with unbounded function $h$ for the partitions of integers derived from the general stick-breaking. It would be interesting to further study separable statistics and approximations to $O_n$ for other permutation models associated with exchangeable partitions.

The organization of the rest of the paper is as follows. In Section 2 we introduce the class of permutations derived from the stick-breaking. The principal results are formulated in Section 3. In Section 4 we prove that under various regularity conditions $\log T_n$ yields a good approximation to $\log O_n$ with an error term of the order $o(\log^{3/2} n)$. In Section 5 we investigate the weak convergence of $\log T_n$ and prove Theorem 3.2; the method here exploits a link between the $K_{n,r}$'s and certain perturbed random walks. Theorem 3.1, which is our generalization of the Erdős-Turán law, then follows as a corollary. The auxiliary results used in the proofs are collected in the Appendix.
2 Permutations derived from stick-breaking

The Basic Construction  Let $W$ be a random variable, called the stick-breaking factor, with values in $(0, 1)$. Consider a multiplicative renewal point process $Q$ with atoms

$$Q_0 := 1, \quad Q_j := \prod_{i=1}^{j} W_i, \quad j \in \mathbb{N},$$

where the $W_i$ are independent replicas of $W$. The gaps in $Q$ yield a partition of $[0, 1]$ into infinitely many intervals $(Q_{j+1}, Q_j]$ accumulating near 0. Let $U_1, \dots, U_n$ be a sample from the uniform $[0, 1]$ distribution, independent of $Q$. A random permutation $\Pi_n$ is defined by organizing integers $i_1, \dots, i_k$ in a cycle $(i_1 \dots i_k)$ if the following occur:

(i) $U_{i_1} < \dots < U_{i_k}$,

(ii) the sample points $U_{i_1}, \dots, U_{i_k}$ fall in the same interval $(Q_{j+1}, Q_j]$,

(iii) only $U_{i_1}, \dots, U_{i_k}$ out of $U_1, \dots, U_n$ fall in this interval $(Q_{j+1}, Q_j]$.

Listing the sample points in increasing order and inserting a $|$ between two neighbouring order statistics if they belong to distinct component intervals of $(0, 1] \setminus Q$, the cycle notation of $\Pi_n$ is read left-to-right.

For instance, the list $U_7 \mid U_3\ U_4\ U_2\ U_5 \mid U_6\ U_1$ yields the permutation $(7)(3\ 4\ 2\ 5)(6\ 1)$. To pass to the standard cycle notation $(1\ 6)(2\ 5\ 3\ 4)(7)$ one needs to re-arrange the cycles in the order of increase of their minimal elements, and to rotate each cycle so that the least element of the cycle appears first. We prefer, however, to write the cycles and the elements within the cycles in accord with the natural order on reals, as dictated by the Basic Construction. A reason for this ordering of cycles is the following recurrence property:

• Regeneration: for $m \in \{1, \dots, n-1\}$, conditionally given that the last cycle of $\Pi_n$ has length $m$, the cycle partition of $\Pi_n$ with the last cycle deleted has the same distribution as the cycle partition of $\Pi_{n-m}$.

It is straightforward from the construction that $\Pi_n$ also satisfies:

• Coherence: permutations $\Pi_n$ are defined consistently for all values of $n$. Passing from $\Pi_{n+1}$ to $\Pi_n$ amounts to removing integer $n + 1$ from a cycle.

• Exchangeability: the distribution of $\Pi_n$ is invariant under conjugations in $\mathfrak{S}_n$. Equivalently, given the cycle partition $(K_{n,1}, \dots, K_{n,n})$ the distribution of $\Pi_n$ is uniform.

In combination with exchangeability, the regeneration property can be re-stated as follows: given that the last cycle of $\Pi_n$ is of length $m$, the permutation resulting from deletion of the last cycle and re-labeling the remaining elements by the increasing bijection with $[n - m]$ is a distributional copy of $\Pi_{n-m}$.
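The Basic Construction translates directly into a simulation. The sketch below is only an illustration under stated assumptions: the function name `stick_breaking_cycles` and its argument `sample_w` (a user-supplied sampler for the factor $W$) are hypothetical, and sample points are drawn from $(0,1]$ so that finitely many atoms of $Q$ suffice.

```python
import random


def stick_breaking_cycles(n, sample_w, rng):
    """Cycles of Pi_n, written in increasing order of the sample points."""
    # Sample points U_1..U_n in (0, 1]; label i carries the point u[i].
    u = {i: 1.0 - rng.random() for i in range(1, n + 1)}
    labels = sorted(u, key=u.get)            # labels by increasing U
    # Atoms Q_0 = 1 > Q_1 > ... until one falls below the smallest point.
    atoms = [1.0]
    while atoms[-1] >= u[labels[0]]:
        atoms.append(atoms[-1] * sample_w(rng))
    cycles, current = [], []
    j = len(atoms) - 1                       # current interval (Q_j, Q_{j-1}]
    for i in labels:
        while u[i] > atoms[j - 1]:           # point lies above this interval
            if current:                      # close the cycle collected so far
                cycles.append(tuple(current))
                current = []
            j -= 1
        current.append(i)
    cycles.append(tuple(current))
    return cycles


rng = random.Random(2024)
# Assumption for the demo: W uniform on (0, 1), i.e. beta(1, 1).
cycles = stick_breaking_cycles(30, lambda r: r.betavariate(1.0, 1.0), rng)
print(cycles)
```

Each returned tuple lists the labels of one occupied interval in increasing order of their sample points, so the last tuple corresponds to the last cycle in the sense of the regeneration property.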

There are two further useful ways to generate the cycle partition of $\Pi_n$.

A Markov chain representation  Consider a decreasing Markov chain on nonnegative integers with absorption at 0 and the decrement matrix

$$q(n, m) = \binom{n}{m}\frac{\mathbb{E}\,W^{n-m}(1-W)^{m}}{1 - \mathbb{E}\,W^{n}}, \quad 1 \le m \le n, \tag{4}$$

specifying transition probabilities from $n$ to $n - m$. For the Markov chain $M_n$ starting at $n$, $K_{n,r}$ is the number of jumps of size $r$ on the path of $M_n$ from $n$ to 0. The arrangement of the cycle lengths in the Basic Construction corresponds to the decrements of $M_n$ written in the time-reversed order.
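To make the Markov chain representation concrete, here is a minimal sketch for a beta-distributed factor. It assumes the decrement probabilities have the form $q(n,m) = \binom{n}{m}\mathbb{E}W^{n-m}(1-W)^m/(1-\mathbb{E}W^n)$, which is what the Basic Construction gives for the number of sample points in the first occupied interval; the function names and the restriction to $W \sim \mathrm{beta}(a,b)$ are choices made for this illustration.

```python
import random
from math import exp, lgamma


def log_beta(x, y):
    """Logarithm of the Euler beta function B(x, y)."""
    return lgamma(x) + lgamma(y) - lgamma(x + y)


def decrement_row(n, a, b):
    """[q(n,1), ..., q(n,n)] for W ~ beta(a, b), under the assumed formula."""
    def moment(i, j):                         # E[W^i (1 - W)^j]
        return exp(log_beta(a + i, b + j) - log_beta(a, b))

    def log_binom(m):                         # log of the binomial coefficient C(n, m)
        return lgamma(n + 1) - lgamma(m + 1) - lgamma(n - m + 1)

    denom = 1.0 - moment(n, 0)                # P{not all n points fall below W_1}
    return [exp(log_binom(m)) * moment(n - m, m) / denom
            for m in range(1, n + 1)]


def sample_decrements(n, a, b, rng):
    """Jump sizes of the chain M_n on its way from n to 0."""
    jumps = []
    while n > 0:
        m = rng.choices(range(1, n + 1), weights=decrement_row(n, a, b))[0]
        jumps.append(m)
        n -= m
    return jumps


print(decrement_row(5, 1.0, 1.0))             # uniform W: every entry is close to 1/5
print(sample_decrements(25, 0.5, 1.0, random.Random(7)))
```

For $a = b = 1$ the row is constant and equal to $1/n$, in agreement with the fact that the uniform permutation is the case $\theta = 1$ of Ewens' family.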



The infinite occupancy scheme  This model is sometimes called the Bernoulli sieve [ini [H [151 dSl [m [il in]. Think of the gaps $(Q_j, Q_{j-1}]$ as boxes $1, 2, \dots$ with frequencies

$$P_j := W_1 W_2 \cdots W_{j-1}(1 - W_j), \quad j \in \mathbb{N}. \tag{5}$$

Given the frequencies, balls $1, 2, \dots$ are thrown independently so that each ball hits box $j$ with probability $P_j$. Then $K_{n,r}$ is the number of boxes occupied by exactly $r$ out of the first $n$ balls.

Additive renewal process representation  Mapping $(0,1]$ to $\mathbb{R}_+$ via $x \mapsto -\log x$ sends $Q$ to the additive renewal process with the generic increment $-\log W$, and sends the uniform sample to a sample from the standard exponential distribution. The construction of the permutation and the occupancy scheme are readily re-stated in the new variables.

It has been observed (see [TT], Theorem 2.1) that the instance of Ewens' permutation fits in the Basic Construction by choosing a factor $W \overset{d}{=} \mathrm{beta}(\theta, 1)$, with the density

$$\mathbb{P}\{W \in dx\} = \theta x^{\theta-1}\,dx, \quad x \in (0, 1).$$

A better known connection of Ewens' $\Pi_n$ to the stick-breaking stems from the fact that the lengths of the cycles in the normalized notation, scaled by $n$, converge as $n \to \infty$ to $(P_1, P_2, \dots)$ as in (5) with $W_j \overset{d}{=} \mathrm{beta}(\theta, 1)$. The distribution of the limit is known as the GEM($\theta$) law, which is related to the Poisson-Dirichlet PD($\theta$)-distribution through a size-biased permutation of the terms. As a finite-$n$ counterpart of this dual role of the stick-breaking, the sequence of lengths of cycles ordered by increase of the minimal elements and the reversed sequence of the cycle lengths derived from the Basic Construction have the same distribution. In particular, both sequences can be identified with the sequence of decrements of the Markov chain $M_n$ with decrement matrix

$$q(n, m) = \binom{n}{m}\frac{(\theta)_{n-m}\,m!}{(\theta+1)_{n-1}\,n}, \quad 1 \le m \le n.$$

It follows from a result of Kingman that the coincidence of distributions of the two different arrangements of the unordered set of the cycle-lengths characterizes the Ewens permutation within the family of random permutations with the regenerative property, see for this fact and variations.
We note in passing that by a version of the Basic Construction each system of coherent random permutations $(\Pi_n)_{n \in \mathbb{N}}$ with the properties of exchangeability and regeneration, with respect to deletion of a cycle of $\Pi_n$ chosen by some random rule, uniquely corresponds to a random regenerative subset of $\mathbb{R}_+$ which coincides with the closed range of a subordinator $S$ [T^]. Distinguishing features of the subfamily in focus in the present paper are: (1) $S$ is a compound Poisson process with jumps distributed like $|\log W|$; (2) the last cycle of $\Pi_n$ has length of the order $O(n)$ as $n$ grows.

3 Main results

In the sequel we use the following notation for the moments of the stick-breaking factor

$$\mu := \mathbb{E}|\log W|, \quad \sigma^2 := \mathrm{Var}(\log W) \quad \text{and} \quad \nu := \mathbb{E}|\log(1 - W)|,$$

which may be finite or infinite. We shall also use the notion of slow variation. A function $\ell : (0, \infty) \to (0, \infty)$ is called slowly varying at $\infty$ if for all $\lambda > 0$,

$$\lim_{x \to \infty} \frac{\ell(\lambda x)}{\ell(x)} = 1.$$

Our purpose is to extend (3) to a wider class of random permutations $\Pi_n$ derived from stick-breaking, along the following lines.

Theorem 3.1. Suppose the law of $W$ is absolutely continuous with a density $f$.

I. If there exist $\delta_1 > 0$ and $\delta_2 > 0$ such that $f$ is nonincreasing on $(0, \delta_1)$, bounded on $[\delta_1, 1 - \delta_2]$ and nondecreasing on $(1 - \delta_2, 1)$ then

(a) If $\sigma^2 < \infty$ then, with

$$b_n = \mu^{-1}\Big(2^{-1}\log^2 n - \int_0^{\log n}\int_0^{z} \mathbb{P}\{|\log(1-W)| > x\}\,dx\,dz\Big) \tag{6}$$

and $a_n = ((3\mu^3)^{-1}\sigma^2\log^3 n)^{1/2}$, the limiting distribution of $(\log O_n - b_n)/a_n$ is standard normal.

(b) If $\sigma^2 = \infty$, and

$$\int_0^{x} y^2\,\mathbb{P}\{|\log W| \in dy\} \sim \ell(x), \quad x \to \infty,$$

for some $\ell$ slowly varying at $\infty$, then, with $b_n$ given by (6) and

$$a_n = (3\mu^3)^{-1/2}\,c_{\lfloor \log n \rfloor}\log n,$$

where $(c_n)$ is any positive sequence satisfying $\lim_{n \to \infty} n\ell(c_n)/c_n^2 = 1$, the limiting distribution of $(\log O_n - b_n)/a_n$ is standard normal.

(c) If

$$\mathbb{P}\{|\log W| > x\} \sim x^{-\alpha}\ell(x), \quad x \to \infty, \tag{7}$$

for some $\ell$ slowly varying at $\infty$ and $\alpha \in (1, 2)$ then, with $b_n$ as in (6) and

$$a_n = ((\alpha + 1)\mu^{\alpha+1})^{-1/\alpha}\,c_{\lfloor \log n \rfloor}\log n,$$

where $(c_n)$ is any positive sequence satisfying $\lim_{n \to \infty} n\ell(c_n)/c_n^{\alpha} = 1$, the limiting distribution of $(\log O_n - b_n)/a_n$ is the $\alpha$-stable law with characteristic function

$$u \mapsto \exp\{-|u|^{\alpha}\,\Gamma(1-\alpha)(\cos(\pi\alpha/2) + i\sin(\pi\alpha/2)\,\mathrm{sgn}(u))\}, \quad u \in \mathbb{R}. \tag{8}$$

II. If for some $\alpha \in [0, 1)$

$$\sup_{x \in [0,1]} x^{\alpha}(1-x)^{\alpha} f(x) < \infty, \tag{9}$$

then $\sigma^2 < \infty$ and

$$\frac{\log O_n - (2\mu)^{-1}\log^2 n}{\sqrt{(3\mu^3)^{-1}\sigma^2\log^3 n}} \overset{d}{\to} \mathcal{N}(0,1), \quad n \to \infty.$$



In particular, these conditions cover all bounded densities, and all beta($a$, $b$) densities with arbitrary parameters $a, b > 0$. Following an approach exploited by previous authors we derive our extension of the Erdős-Turán law in two steps. We first show that the accompanying quantity $\log T_n$ yields a good approximation to $\log O_n$, where

$$T_n := \prod_{r=1}^{n} r^{K_{n,r}} \tag{10}$$

is the product of cycle lengths of $\Pi_n$. Then we study the weak convergence of $\log T_n$.

The functional $\log T_n$ is an instance of a separable statistic of the form $\sum_r K_{n,r}h(r)$ (the terminology is borrowed from [261 |2Z], where it was used in the context of occupancy problems). The functionals $K_{n,r}$ and $K_n$ are themselves of this kind with some indicator functions $h$, but for $\log T_n$ the function $h$ is unbounded. For Ewens' permutations quite general separable statistics were studied by Babu and Manstavičius, see e.g. [25].

Theorem 3.2. If $W$ satisfies the moment conditions required, respectively, in parts (a), (b) and (c) of Theorem 3.1, then the conclusions of parts (a), (b) and (c) hold with $\log O_n$ replaced by $\log T_n$, without the assumption regarding the existence of a density of $W$.

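Example: beta distributions  Assuming $W \overset{d}{=} \mathrm{beta}(\theta, 1)$ we have $\mu = \theta^{-1}$, $\sigma^2 = \theta^{-2}$ and

$$\lim_{n \to \infty}\frac{\int_0^{\log n}\int_0^{z}\mathbb{P}\{|\log(1-W)| > x\}\,dx\,dz}{\log^{3/2} n} = 0$$

since the numerator is $O(\log n)$. Application of Theorem 3.2(a) yields

$$\frac{\log T_n - (\theta/2)\log^2 n}{\sqrt{(\theta/3)\log^3 n}} \overset{d}{\to} \mathcal{N}(0, 1), \quad n \to \infty,$$

which was previously obtained in [3], equation (34).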
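The constants in this example can be checked by a short computation. The following worked block is a sketch added for the reader's convenience, assuming only that $W$ has distribution function $w^{\theta}$ on $(0,1)$.

```latex
% Assumption: W ~ beta(theta, 1), i.e. P{W <= w} = w^theta for w in (0,1).
\begin{align*}
\mathbb{P}\{|\log W| > x\} &= \mathbb{P}\{W < e^{-x}\} = e^{-\theta x}, \quad x \ge 0,\\
\mu &= \mathbb{E}|\log W| = \theta^{-1}, \qquad
\sigma^2 = \operatorname{Var}(\log W) = \theta^{-2},\\
(2\mu)^{-1}\log^2 n &= \tfrac{\theta}{2}\log^2 n, \qquad
(3\mu^3)^{-1}\sigma^2\log^3 n = \tfrac{\theta}{3}\log^3 n .
\end{align*}
```

In words, $|\log W|$ is exponentially distributed with rate $\theta$, so the general centering and scaling reduce to those of the Ewens case.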
4 Approximation of $\log O_n$ by $\log T_n$

For $j \in [n]$ set

$$D_{n,j} := \sum_{r \le n,\ j \mid r} K_{n,r} = \sum_{r=1}^{\lfloor n/j \rfloor} K_{n,rj}.$$

For later use we need appropriate bounds for the expectation $\mathbb{E}(D_{n,j} - 1)^+$.

Lemma 4.1. Under the assumptions of Theorem 3.1 the asymptotic relations

$$\mathbb{E}(D_{n,j} - 1)^+ = O\Big(\frac{\log n}{j}\Big), \tag{11}$$

$$\mathbb{E}(D_{n,j} - 1)^+ = O\Big(\frac{\log^2 n}{j^2}\Big) \tag{12}$$

hold uniformly in $j \in [n]$.

Proof Define := J2l=i^ Kn,rj. It is obvious that {D^j - 1)+ < I^i']. 

Let $A_n$ be the length of the last cycle of $\Pi_n$, with distribution $\mathbb{P}\{A_n = j\} = q(n, j)$ as in (4). One can check that the bivariate array $(D^{(1)}_{n,j})$ satisfies the distributional recurrence

= lb1A„,,<A„<n-,} + ^i'2A„„ n>J, (13) 

where the variables D^]. are assumed independent of n„ and marginally distributed like 
\ n, /c G N. Taking expectations yields 

L"/iJ-i n 
^^2= P{^n = rj} + ^P{n-A„ = i}EDg forn>j, 

r=l j=j 

and EDJ,^] = forra < j. 
By Lemma 6.2,

$$j\sum_{r=1}^{\lfloor n/j \rfloor - 1}\mathbb{P}\{A_n = rj\} = O(1), \quad j \le n,\ n \in \mathbb{N}. \tag{14}$$

Now relation (11) follows by virtue of part (i) of Lemma 6.1 and Lemma 6.3 with $c_j = j$.

To prove the second assertion f|T2|) . note that 



(^ri,i - 1) - {Dn,j - 1) l{X„,L„/ij,=0} - {D^j - 1) l{i^„,L"/jJi=0} 

;,(2) 



< - 1)+ < <](^S] - l)/2 =: 
holds almost surely. Squaring relation (13) and using (11) and (14) yield

n 

ED^ = 0{r^\ogn) + ^P{n - A„ = z}e45, n > J, J G N, 

Finally, an application of part (ii) of Lemma 6.1 and Lemma 6.3 with $c_j = j^2$ establishes (12), as wanted. □

The following estimate of the difference $\log T_n - \log O_n$ generalizes Lemma 4 in [1].

Lemma 4.2. Under the assumptions of Theorem 3.1 the following asymptotic relation holds:

$$\mathbb{E}(\log T_n - \log O_n) = O(\log n\,\log\log n), \quad n \to \infty.$$

Proof. We start with a known representation (p. 289 in [24])

$$\log T_n - \log O_n = \sum_{p \in \mathcal{P}}\log p\sum_{s \ge 1}(D_{n,p^s} - 1)^+,$$

where $\mathcal{P}$ denotes the set of prime numbers, which implies

$$\mathbb{E}(\log T_n - \log O_n) = \sum_{p \in \mathcal{P},\ s \ge 1}\log p\,\mathbb{E}(D_{n,p^s} - 1)^+$$

$$\le \sum_{p \in \mathcal{P},\ p \le \log n}\log p\,\mathbb{E}(D_{n,p} - 1)^+ + \sum_{p \in \mathcal{P},\ s \ge 2,\ p^s \le \log n}\log p\,\mathbb{E}(D_{n,p^s} - 1)^+ + \sum_{p \in \mathcal{P},\ s \ge 1,\ p^s > \log n}\log p\,\mathbb{E}(D_{n,p^s} - 1)^+ =: S_1(n) + S_2(n) + S_3(n).$$

Applying (11) along with Theorem 4.10 in [2], which states that

$$\sum_{p \in \mathcal{P},\ p \le x}\frac{\log p}{p} = \log x + O(1), \quad x \to \infty,$$

proves $S_1(n) = O(\log n\,\log\log n)$. Using (11) again yields $S_2(n) = O(\log n)$. Finally, from (12) we infer $S_3(n) = O(\log n\,\log\log n)$. The proof is complete. □
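The prime-power representation of $\log T_n - \log O_n$ used in this proof can be checked numerically on any list of cycle lengths. The sketch below is an illustration only; the function names are hypothetical, and $D_{n,q}$ is computed directly as the number of cycles whose length is divisible by $q$.

```python
from functools import reduce
from math import gcd, isclose, log


def primes_up_to(n):
    """All primes p <= n, by trial division (adequate for small n)."""
    return [p for p in range(2, n + 1)
            if all(p % d for d in range(2, int(p ** 0.5) + 1))]


def log_T(lengths):
    """Logarithm of the product of the cycle lengths."""
    return sum(log(r) for r in lengths)


def log_O(lengths):
    """Logarithm of the l.c.m. of the cycle lengths."""
    return log(reduce(lambda a, b: a * b // gcd(a, b), lengths, 1))


def prime_power_excess(lengths):
    """Sum over primes p and s >= 1 of log(p) * (D_{n, p^s} - 1)^+."""
    n = sum(lengths)
    total = 0.0
    for p in primes_up_to(n):
        q = p
        while q <= n:
            d = sum(1 for r in lengths if r % q == 0)   # D_{n,q}
            total += log(p) * max(d - 1, 0)
            q *= p
    return total


lengths = [6, 4, 2, 9, 3]                               # a cycle type with n = 24
print(log_T(lengths) - log_O(lengths), prime_power_excess(lengths))
```

The identity holds because the exponent of a prime $p$ in $T_n$ is $\sum_s D_{n,p^s}$, while in $O_n$ it is the number of $s$ with $D_{n,p^s} \ge 1$.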

5 Weak convergence of $\log T_n$

To prove Theorem 3.2 we shall exploit a strategy as in [H] (see also [20]), which amounts to connecting the asymptotics of $\log T_n$ (as $n \to \infty$) with that of the 'small frequencies' $P_k$ (as $k \to \infty$). Since the process $(\log P_k)_{k \in \mathbb{N}}$ defined by (5) is a particular perturbed random walk, we start in Subsection 5.1 by developing the necessary background on perturbed random walks. These results are further specialized to $\log P_k$ in Subsection 5.2, which eventually allows us to complete the proof of Theorem 3.2.
5.1 Results for perturbed random walks

Let $(\xi_k, \eta_k)_{k \in \mathbb{N}}$ be independent copies of a random vector $(\xi, \eta)$ with arbitrarily dependent components $\xi > 0$ and $\eta \ge 0$. We assume that the law of $\xi$ is nondegenerate and that the law of $\eta$ is not the Dirac mass at 0. Set $F(x) := \mathbb{P}\{\eta \le x\}$ and $r(x) := \int_0^x (1 - F(y))\,dy$. For $(S_k)_{k \in \mathbb{N}_0}$ a random walk with $S_0 = 0$ and increments $\xi_k$, the sequence $(T_k)_{k \in \mathbb{N}}$ with

$$T_k := S_{k-1} + \eta_k, \quad k \in \mathbb{N},$$

is called a perturbed random walk. Since $\lim_{k \to \infty} T_k = \infty$ a.s., there is some finite number

$$N(x) := \#\{k \in \mathbb{N} : T_k \le x\}, \quad x \ge 0,$$

of sites visited on the interval $[0, x]$. Set also

$$\rho(x) := \#\{k \in \mathbb{N}_0 : S_k \le x\} = \inf\{k \in \mathbb{N} : S_k > x\}, \quad x \ge 0,$$

and

$$M(x) := \sum_{k \ge 0}\mathbb{E}\big(1_{\{T_{k+1} \le x\}} \mid S_k\big) = \sum_{k \ge 0} F(x - S_k), \quad x \ge 0.$$
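The quantities $N(x)$ and $\rho(x)$ are easy to simulate, which may help to fix the definitions. The sketch below is an illustration under stated assumptions: the names `sieve_pair` and `counts` are hypothetical, and the pair $(\xi, \eta) = (|\log W|, |\log(1-W)|)$ with $W \sim \mathrm{beta}(\theta, 1)$ is just one convenient choice of a vector with dependent components.

```python
import random
from math import log


def sieve_pair(theta):
    """Sampler of (xi, eta) = (|log W|, |log(1 - W)|) for W ~ beta(theta, 1)."""
    def draw(rng):
        w = 0.0
        while not 0.0 < w < 1.0:              # guard against the endpoints
            w = rng.betavariate(theta, 1.0)
        return -log(w), -log(1.0 - w)
    return draw


def counts(x, draw_pair, rng):
    """One realisation of (N(x), rho(x)) for the perturbed random walk."""
    s, n_x, rho_x = 0.0, 0, 0                 # s runs through S_0, S_1, ...
    while s <= x:
        rho_x += 1                            # S_k <= x contributes to rho(x)
        xi, eta = draw_pair(rng)
        if s + eta <= x:                      # T_{k+1} = S_k + eta_{k+1} <= x
            n_x += 1
        s += xi
    return n_x, rho_x


rng = random.Random(11)
print(counts(15.0, sieve_pair(1.5), rng))
```

Since $\eta \ge 0$ and the walk is increasing, no $T_k$ can fall in $[0, x]$ once $S_{k-1}$ has exceeded $x$, which is why the loop may stop there and why $N(x) \le \rho(x)$.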

The main result of this subsection is given next.

Theorem 5.1. Assume that $m := \mathbb{E}\xi < \infty$ and

$$\frac{\rho(x) - m^{-1}x}{c(x)} \overset{d}{\to} Z, \quad x \to \infty.$$

Then

$$I(x) := \frac{\int_0^x \big(N(y) - m^{-1}(y - r(y))\big)\,dy}{x\,c(x)} \overset{d}{\to} \int_0^1 Z(y)\,dy =: X, \quad x \to \infty,$$

where $(Z(t))_{t \ge 0}$ is a stable Lévy process such that $Z(1)$ has the same law as $Z$.

Remark 5.2. It is known (see Proposition 27 in [28]) that $c(x) \sim x^{\beta}\ell_1(x)$ for some $\beta \in [1/2, 1)$ and some slowly varying $\ell_1$, where $\beta$ and $\ell_1$ depend on the distribution of $\xi$. Furthermore, if $\beta = 1/2$ then either $\ell_1(x) = \mathrm{const}$ or $\lim_{x \to \infty}\ell_1(x) = \infty$. Thus, in any case,

$$\frac{x}{c^2(x)} = O(1), \quad x \to \infty. \tag{15}$$

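The proof of Theorem 5.1 relies heavily upon the following

Lemma 5.3. Under the assumption and notation of Theorem 5.1,

$$\frac{\int_0^x (\rho(y) - m^{-1}y)\,dy}{x\,c(x)} \overset{d}{\to} \int_0^1 Z(y)\,dy, \quad x \to \infty. \tag{16}$$

Proof. It is known (see Theorem 1b in [7]) that

$$W_x(\cdot) := \frac{\rho(x\,\cdot) - m^{-1}(x\,\cdot)}{c(x)} \Rightarrow Z(\cdot), \quad x \to \infty, \tag{17}$$

in $D[0, \infty)$ in the $M_1$-topology. Since integration is a continuous operator from $D[0, \infty)$ to $D[0, \infty)$, we have

$$\int_0^1 W_x(y)\,dy \overset{d}{\to} \int_0^1 Z(y)\,dy, \quad x \to \infty,$$

which is equivalent to (16). □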
Remark  When $Z(\cdot)$ is a Brownian motion, the one-dimensional convergence in (16) can be upgraded to the functional limit theorem. Indeed, since $(Z(t))$ is continuous the convergence in (17) is equivalent to the locally uniform convergence. Furthermore, the integration $z(\cdot) \mapsto \int_0^{(\cdot)} z(y)\,dy$ is continuous w.r.t. the locally uniform convergence. Hence, by the continuous mapping theorem,

$$\int_0^{(\cdot)} W_x(y)\,dy \Rightarrow \int_0^{(\cdot)} Z(y)\,dy, \quad x \to \infty,$$

in $D[0, \infty)$.

Lemma 5.4 collects some facts borrowed from

Lemma 5.4. (a) $\mathbb{E}(N(x) - M(x))^2 = o(x)$, as $x \to \infty$.
(b) Under the assumption and notation of Theorem 5.1,

$$\frac{\sup_{0 \le y \le x}(\rho(y) - m^{-1}y)}{c(x)} \overset{d}{\to} \sup_{t \in [0,1]} Z(t), \quad x \to \infty,$$

and

$$\frac{\inf_{0 \le y \le x}(\rho(y) - m^{-1}y)}{c(x)} \overset{d}{\to} \inf_{t \in [0,1]} Z(t), \quad x \to \infty.$$

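Proof of Theorem 5.1. Applying the Cauchy-Schwarz inequality,

$$\frac{\mathbb{E}\big(\int_0^x (N(y) - M(y))\,dy\big)^2}{x^2c^2(x)} \le \frac{\int_0^x \mathbb{E}(N(y) - M(y))^2\,dy}{x\,c^2(x)} = \frac{o(x)}{c^2(x)},$$

where for the final estimate Lemma 5.4(a) was utilized. In view of (15), the latter expression goes to 0, which implies that

$$\frac{\int_0^x (N(y) - M(y))\,dy}{x\,c(x)} \overset{P}{\to} 0, \quad x \to \infty. \tag{18}$$

Since

$$\frac{\int_0^x \big(N(y) - m^{-1}(y - r(y))\big)\,dy}{x\,c(x)} = \frac{\int_0^x (N(y) - M(y))\,dy}{x\,c(x)} + \frac{\int_0^x \big(M(y) - m^{-1}(y - r(y))\big)\,dy}{x\,c(x)},$$

we have to prove that the second summand converges in distribution to $X$.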
With S G (0, 1) such that y^ = o{c{y)), write for ?/ > 1 

F{y) + M{y)-m-\y-r{y)) = f {p{y ~ z) - m-\y - z))dF {z) 

Jo 

. . . + 

T,{y)+T,{y). 
In view of

TM < {p{y) - r^-'y)F{y') + m-'y'F{y') < {p{y) - m-'y) + m-^/ 

we have 

i:Uy)dy ^ j;T,iy)dy , J^ipjy) - ra-^y)dy ^ jS + iy^x' , ^ , ^ , ^ 

r~\ — r~\ ^ r~\ ^ r~\ — ~^ u + a+ u- a, 

xc[x) xc[x) xc[x) mc[x) 

where the last step is justified by Lemma 5.3 and the choice of $\delta$. Further,

Tiiy) > {p{y) - m-'y) - {p{y) - m-^y){l - F{y')) - {p{y) - p{y - /)). 

Since 

^!^{p{y)-p{y-y'))dy ^ j^^p{y')dy ^ ¥.p{x') x' ^ . q ^ ^ 
xc{x) ~ xc{x) ~ x^ c{x) ' 

by the elementary renewal theorem and the choice of $\delta$, we conclude that

K{p{.y) - pjy - y^))dy ^ ^ 

xc{x) 

Therefore, 

xc{x) ~ xc{x) c(x) X 

Iiipjy) - p{y-y^))dy 

xc{x) 

4 x-o-o = x, 

by Lemma 5.3 and Lemma 5.4(b).
Finally,

inf {p{z)-m-^z){F{y)-F{y'))<T2{y)< sup {p{z) - m-h){F{y) - F{y')) 

0<z<y 0<z<y 

entails 

j;Uy)dy ^ j:T,{y)dy ^ ofL^^^'^ '""""^ KjFjy) - F{y^))dy p 
xc{x) ~ xc{x) c{x) X ' 

where the last step follows from Lemma 5.4(b) and the trivial fact that the last ratio goes to 0 for any distribution function $F$. Similarly,

j;T,{y)dy ^ j;T,{y)dy ^ o'^M'^ -"""'U^jFiy) - F{y^))dy p 
xc{x) ~ xc{x) c{x) X 

Putting the pieces together completes the proof. □ 
5.2 Proof of Theorem 3.2 and Theorem 3.1

Proof of Theorem 3.2. We shall make use of the Poissonized version of the occupancy model with random frequencies $(P_k)$, in which balls are thrown in boxes at epochs of a unit rate Poisson process $(\pi_t)_{t \ge 0}$. For simplicity we use the notation $V(t) = \log T_{\pi_t}$. Set

$$\rho^*(x) := \inf\{k \in \mathbb{N} : W_1 \cdots W_k < e^{-x}\}, \quad x \ge 0,$$

and

$$N^*(x) := \#\{k \in \mathbb{N} : P_k \ge e^{-x}\} = \#\{k \in \mathbb{N} : W_1 \cdots W_{k-1}(1 - W_k) \ge e^{-x}\}, \quad x \ge 0.$$

First of all, we need a refined large deviation result for $(\pi_t)$ itself: for $t > 1$,

$$\mathbb{P}\{\pi_t \le (1 - \varepsilon_t)t\} \le \exp\{-t(\varepsilon_t + (1 - \varepsilon_t)\log(1 - \varepsilon_t))\} =: q(t), \tag{19}$$

where $\varepsilon_t := t^{-\beta}$, for any $\beta \in (0, 1/2)$. Note that $\lim_{t \to \infty} q(t) = 0$ with $-\log q(t) \sim t^{1-2\beta}/2$. Inequality (19) is the Chernoff bound for the Poisson distribution and follows in a standard way by first applying Markov's inequality to $e^{-s\pi_t}$ and then minimizing the right-hand side over $s$.
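The Chernoff bound for the lower tail of the Poisson distribution can be compared with the exact tail numerically. The following sketch is an illustration only: the function names are hypothetical, and the exponent $\beta = 0.4$ is an arbitrary admissible choice from $(0, 1/2)$.

```python
from math import exp, floor, lgamma, log


def poisson_cdf(k, t):
    """P{pi_t <= k} for pi_t Poisson with mean t, summed term by term."""
    return sum(exp(-t + j * log(t) - lgamma(j + 1)) for j in range(k + 1))


def chernoff_q(t, eps):
    """Chernoff bound exp{-t (eps + (1 - eps) log(1 - eps))} for the lower tail."""
    return exp(-t * (eps + (1.0 - eps) * log(1.0 - eps)))


beta = 0.4                                    # any value in (0, 1/2)
for t in (5.0, 40.0, 300.0):
    eps = t ** (-beta)
    exact = poisson_cdf(floor((1.0 - eps) * t), t)
    print(t, exact, chernoff_q(t, eps))       # the exact tail stays below the bound
```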

For $j = 1, 2$, set

$$f_j(t) := \mathbb{E}(\log^+ \pi_t)^j = e^{-t}\sum_{k \ge 2}\log^j k\,\frac{t^k}{k!}, \quad t \ge 0.$$

These functions are nondecreasing and differentiable with $f_j(0) = 0$ and

$$f_j'(0) = 0. \tag{20}$$

Let us prove that

$$\lim_{t \to \infty}(f_1(t) - \log t) = 0 \tag{21}$$

and

$$\lim_{t \to \infty} h(t) = 0, \tag{22}$$

where $h(t) := \mathrm{Var}(\log^+ \pi_t)$. To this end, write

$$f_1(t) - \log t \le \mathbb{E}\log(\pi_t + 1) - \log t \le \log(t + 1) - \log t \le t^{-1}, \tag{23}$$

where at the second step Jensen's inequality has been utilized. Similarly,

$$f_2(t) - \log^2 t \le \mathbb{E}\log^2(\pi_t + 1) - \log^2 t \le \log^2(t + 1) - \log^2 t \le 2t^{-1}\log(t + 1). \tag{24}$$

Note that we actually work on the set $\{\pi_t \ge 2\}$ and that the function $t \mapsto \log^2(1 + t)$ is concave for $t \ge 2$.
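The behaviour of $f_1(t) - \log t$ and of $h(t) = f_2(t) - f_1^2(t)$ can be observed numerically by summing the Poisson series directly. This is a sketch under the assumption that truncating the series once the Poisson weights past the mode drop below a small tolerance is accurate enough; the function name `f` and the tolerance are choices made for the illustration.

```python
from math import exp, lgamma, log


def f(j, t, tol=1e-16):
    """f_j(t) = E (log^+ pi_t)^j for pi_t Poisson with mean t > 0."""
    total, k = 0.0, 2                         # terms with k = 0, 1 vanish
    while True:
        weight = exp(-t + k * log(t) - lgamma(k + 1))
        total += weight * log(k) ** j
        if k > t and weight < tol:            # past the mode and negligible
            break
        k += 1
    return total


for t in (20.0, 50.0, 200.0):
    gap = f(1, t) - log(t)                    # compare with the upper bound 1/t
    var = f(2, t) - f(1, t) ** 2              # h(t), the variance of log^+ pi_t
    print(t, gap, var)
```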
Furthermore, for large enough $t$, and $\varepsilon_t$ defined above,

$$f_1(t) - \log t \ge \mathbb{E}(\log^+ \pi_t - \log t)1_{\{\pi_t > (1 - \varepsilon_t)t\}} - \log t\,\mathbb{P}\{\pi_t \le (1 - \varepsilon_t)t\} \ge \log(1 - \varepsilon_t)\,\mathbb{P}\{\pi_t > (1 - \varepsilon_t)t\} - q(t)\log t =: p(t),$$

and the last expression goes to zero (with rate $t^{-\beta}$), as $t \to \infty$. Combining this inequality with (23) proves (21). Note also that

$$f_1^2(t) = \log^2 t + 2\log t\,(f_1(t) - \log t) + (f_1(t) - \log t)^2 \ge \log^2 t + 2p(t)\log t. \tag{25}$$

Hence, by (24) and (25),

$$h(t) = f_2(t) - f_1^2(t) \le 2\big(t^{-1}\log(t + 1) - p(t)\log t\big) = O(\log t/t^{\beta}),$$

which proves (22).$^1$

The basic observations for the subsequent work are given and proved next:

$$\mathbb{E}(V(t) \mid (P_k)) = \sum_{k \ge 1} f_1(tP_k) = \int_1^{\infty} f_1(t/x)\,dN^*(\log x) = \int_0^{\log t}(\log t - x)\,dN^*(x) + O_P(\log t) = \int_0^{\log t} N^*(x)\,dx + O_P(\log t) \tag{26}$$

and

$$\mathrm{Var}(V(t) \mid (P_k)) = \sum_{k \ge 1} h(tP_k) = O_P(\log t), \tag{27}$$

where $O_P(\log t)$ means that $O_P(\log t)/\log t$ is bounded in probability.

The a.s. finiteness of the conditional expectation (and even its integrability) can be justified as follows:

$$\mathbb{E}\log T_n \le (\log^+ n)\,\mathbb{E}K_n \le n\log^+ n.$$

Hence $\mathbb{E}V(t) \le \mathbb{E}\,\pi_t\log^+ \pi_t < \infty$. The integrability of the conditional variance can be checked similarly.

Since $N^*(\log y) \le \rho^*(\log y)$, and $\rho^*(\log y) = O_P(\log y)$ we conclude that

$$N^*(\log y) = O_P(\log y). \tag{28}$$

$^1$ Alternatively, both (21) and (22) can be deduced from Theorem 4 in [23]. To keep the paper self-contained we prefer to give an elementary real-analytic argument.
Using this and (21) gives

$$\int_1^{t} f_1(t/x)\,dN^*(\log x) = \int_0^{\log t}(\log t - x)\,dN^*(x) + O_P(\log t).$$

In fact, only boundedness of $f_1(t) - \log t$ was used. Further,

$$\int_t^{\infty} f_1(t/x)\,dN^*(\log x) = -f_1(1)N^*(\log t) + \int_0^1 N^*(\log t - \log x)f_1'(x)\,dx$$

$$\le O_P(\log t) + \rho^*(\log t)f_1(1) + \int_0^1 \big(\rho^*(\log t - \log x) - \rho^*(\log t)\big)f_1'(x)\,dx = O_P(\log t),$$

since by the well-known bound for the renewal function

$$\mathbb{E}\int_0^1 \big(\rho^*(\log t - \log x) - \rho^*(\log t)\big)f_1'(x)\,dx \le \int_0^1 (C_1|\log x| + C_2)f_1'(x)\,dx < \infty,$$

where $C_1$ and $C_2$ are positive constants. Thus we have proved (26). The proof of (27) follows the same pattern, the only minor difference being that now we use the inequality

$$\int_t^{\infty} h(t/x)\,dN^*(\log x) \le \int_t^{\infty} f_2(t/x)\,dN^*(\log x)$$

and (20) for $f_2$.

Throughout the rest of the proof we apply results of Subsection 5.1 to the vector $(\xi, \eta) := (|\log W|, |\log(1 - W)|)$. With this specific choice the quantities $\rho(x)$ and $N(x)$ defined in Subsection 5.1 turn into $\rho^*(x)$ and $N^*(x)$.

Let $(X(t))_{t \ge 0}$ be a Lévy process with $\log\mathbb{E}e^{izX(1)} = \psi(z)$, $z \in \mathbb{R}$. Then

$$\log\mathbb{E}\exp\Big(iz\int_0^1 X(t)\,dt\Big) = \int_0^1 \psi(zs)\,ds, \tag{29}$$

which follows from a Riemann approximation to the integral.

Assume that the assumptions of Theorem 3.2 hold, which implies that the assumption of Theorem 5.1 (with $\rho$ replaced by $\rho^*$) holds. By scaling $Z$ and $c(x)$, if necessary, we can assume that $Z$ has the standard normal distribution under the assumptions of parts (a) and (b) of Theorem 3.2 and that $Z$ has a stable law with characteristic function (8) under (7). Then (29) implies that $X \overset{d}{=} 3^{-1/2}Z \overset{d}{=} \mathcal{N}(0, 1/3)$ in the first case, and that $X \overset{d}{=} (\alpha + 1)^{-1/\alpha}Z$ in the second case.
By Theorem 5.1,

$$\frac{\int_0^{\log t}\big(N^*(y) - \mu^{-1}(y - r^*(y))\big)\,dy}{c(\log t)\log t} \overset{d}{\to} X, \quad t \to \infty,$$

where $r^*(y) := \int_0^y \mathbb{P}\{|\log(1 - W)| > z\}\,dz$. Since $\lim_{t \to \infty} c(t) = \infty$, using (26) yields

$$\frac{\mathbb{E}(V(t) \mid (P_k)) - \mu^{-1}\big(2^{-1}\log^2 t - \int_0^{\log t} r^*(y)\,dy\big)}{c(\log t)\log t} \overset{d}{\to} X, \quad t \to \infty,$$

and hence

$$\frac{V(t) - \mu^{-1}\big(2^{-1}\log^2 t - \int_0^{\log t} r^*(y)\,dy\big)}{c(\log t)\log t} \overset{d}{\to} X, \quad t \to \infty,$$

by virtue of (27) and Chebyshev's inequality.

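Now we have to de-Poissonize, i.e., to pass from the Poissonized occupancy model to the fixed-$n$ model. This is simple as $(\log T_n)$ is a nondecreasing sequence. Set

$$b(t) := \mu^{-1}\Big(2^{-1}\log^2 t - \int_0^{\log t}\int_0^{z}\mathbb{P}\{|\log(1 - W)| > y\}\,dy\,dz\Big) \quad \text{and} \quad a(t) := c(\log t)\log t.$$

Recall that we take a properly adjusted $c(x)$. Since $a(t)$ grows faster than the logarithm, we have

$$\lim_{t \to \infty}\frac{b(t) - b(\lfloor t(1 \pm \varepsilon) \rfloor)}{a(t)} = 0$$

for every $\varepsilon > 0$. This together with slow variation of $a(t)$ give

$$\lim_{t \to \infty}\frac{a(t)}{a(\lfloor t(1 \pm \varepsilon) \rfloor)} = 1.$$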

By the monotonicity of (logT„), we have 

X+(t) = X+(t)l,5.+X+(t)l(z),)c 



%+,),j-&(Lt(i + 5)j) 

- a(Lt(l + e)J) lA+X+(t)l(DO^ 



where A := (vrt G [[(1 -£)tj, [(1 +^)^J]}- Since P(A) ^ 1, hence X+(t)l(z3,)c 4 0, 
we conclude that 

P{X > x} < hminf pji^^^^V^^ > X 

n^oD a[n) 

for all $x \in \mathbb{R}$. To prove the converse inequality for the upper bound one can proceed in a similar manner.

It remains to set $b_n = b(n)$, and $a_n = (\alpha + 1)^{-1/\alpha}a(n)$ if the assumption of part (c) holds, and $a_n = 3^{-1/2}a(n)$ if the assumptions of parts (a) and (b) hold. The fact that the so-defined $a_n$ and $b_n$ are of the form stated in Theorem 3.2 follows from the considerations above and from, for instance, Proposition 27 in [28]. The proof of Theorem 3.2 is complete.

Proof of Theorem 3.1. By Theorem 3.2, $(\log T_n - b_n)/a_n$, with case-dependent $a_n$ and $b_n$ defined in Theorem 3.1, weakly converges. In particular, we know that $\log^{3/2} n = O(a_n)$. It remains to apply Lemma 4.2 and Markov's inequality. The proof of Theorem 3.1 is complete.
6 Appendix

The following lemma is a simple consequence of Proposition 3 in [10].

Lemma 6.1. Assume that the sequence $a_n$ satisfies the following recurrence relation

$$a_0 = 0, \quad a_n = b_n + \sum_{k=1}^{n} q(n, k)\,a_{n-k}, \quad n \in \mathbb{N}.$$

Then

(i) if $b_n = O(1)$ then $a_n = O(\log n)$, as $n \to \infty$,
(ii) if $b_n = O(\log n)$ then $a_n = O(\log^2 n)$, as $n \to \infty$.
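A recurrence of this kind can be solved numerically in a few lines, which gives a quick sanity check of the logarithmic growth in part (i). The sketch below is an illustration under stated assumptions: the function name is hypothetical, and the demo uses the uniform decrement matrix $q(n,k) = 1/n$ with $b_n = 1$, for which $a_n$ is the expected number of cycles of a uniform permutation, that is, the harmonic number $H_n$.

```python
from math import log


def solve_recurrence(n_max, b, q):
    """Solve a_0 = 0, a_n = b(n) + sum_{k=1}^n q(n, k) a_{n-k} for n <= n_max."""
    a = [0.0] * (n_max + 1)
    for n in range(1, n_max + 1):
        a[n] = b(n) + sum(q(n, k) * a[n - k] for k in range(1, n + 1))
    return a


# Demo assumption: uniform decrements and b_n = 1 (each jump counts once).
a = solve_recurrence(200, lambda n: 1.0, lambda n, k: 1.0 / n)
harmonic = sum(1.0 / i for i in range(1, 201))
print(a[200], harmonic, log(200))             # a_200 equals H_200, of order log n
```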
The next lemma verifies (14) which is a key ingredient of the proof of Lemma 4.1.

Lemma 6.2. Relation (14) holds provided the density $f$ of $W$ satisfies any of the following two conditions:

(i) condition (9) holds for some $\alpha \in [0, 1)$.

(ii) there exist $\delta_1 > 0$ and $\delta_2 > 0$ such that $f$ is nonincreasing on $(0, \delta_1)$, bounded on $[\delta_1, 1 - \delta_2]$ and nondecreasing on $(1 - \delta_2, 1)$.

Proof. We start with the easier part (i). We have

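$$k\sum_{r=1}^{\lfloor n/k \rfloor - 1}\mathbb{P}\{A_n = rk\} = k\sum_{r=1}^{\lfloor n/k \rfloor - 1}\binom{n}{rk}\frac{\int_0^1 x^{n-rk}(1-x)^{rk}f(x)\,dx}{1 - \mathbb{E}W^n}$$

$$\le \mathrm{const}\cdot k\sum_{r=1}^{\lfloor n/k \rfloor - 1}\binom{n}{rk}\int_0^1 x^{n-rk-\alpha}(1-x)^{rk-\alpha}\,dx$$

$$= \mathrm{const}\cdot k\sum_{r=1}^{\lfloor n/k \rfloor - 1}\frac{\Gamma(n+1)\,\Gamma(n-rk-\alpha+1)\,\Gamma(rk-\alpha+1)}{\Gamma(n-2\alpha+2)\,\Gamma(n-rk+1)\,\Gamma(rk+1)}$$

$$\le \mathrm{const}\cdot\frac{k^{1-2\alpha}}{n^{1-2\alpha}}\sum_{r=1}^{\lfloor n/k \rfloor - 1}\big((\lfloor n/k \rfloor - r)\,r\big)^{-\alpha} = O(1).$$

The fourth line is a consequence of the inequality given in [1], formula (6.1.47): for $c, d > -1$ there exists $M_{c,d} > 0$ such that for all $n \in \mathbb{N}$

$$\frac{\Gamma(n + c)}{\Gamma(n + d)} \le M_{c,d}\,n^{c-d}.$$

The equality in the last line follows from the estimate $\sum_{j=1}^{m-1}(j(m-j))^{-\alpha} \le \mathrm{const}\cdot m^{1-2\alpha}$, which holds for $\alpha < 1$ and $m \in \mathbb{N}$. The proof of part (i) is complete.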
Passing to part (ii) we can write 

f{x) = < 5i]h{x) + P{5i <W<1- 62]f2{x) + > 1 - S2}f3{x), (30) 

where /i, /2 and /s are some densities such that fi is nonincreasing on (0,1), /s is 
nondecreasing on (0,1) and /2 is bounded on (0,1). It is known (see [22]) that if a 
random variable X with support [0, 1] has a nonincreasing (nondecreasing) density h then 
there exists a distribution function G such that h(x) = ^^^^ (resp. h{x) = J^_^ ^^^^)- 
Using this observation fl30l) can be rewritten as follows 

fix) = F{W < 6i} / ^fMldG'i(y) + P{5i <W<1- 62}f2{x) 

Jo y 

+ ¥{W>l-62} r^^^^MldG2(2/), 

Jo y 

where Gi,G2 are some distribution functions concentrated on [0,6i] and [1 — 52,1], re- 
spectively. 

The last formula can be seen as a representation of $f$ as a convex linear combination of the densities of three types: $g_{\varepsilon}(x) = \varepsilon^{-1}1_{\{x \in [0, \varepsilon]\}}$, $h_{\varepsilon}(x) = \varepsilon^{-1}1_{\{x \in [1 - \varepsilon, 1]\}}$ and bounded densities. Thus to prove (ii) it is enough to show that relation (14) holds for densities of these types uniformly in $\varepsilon \in (0, 1)$. The validity of (14) for bounded densities follows from part (i) of the lemma (take $\alpha = 0$). We only check (14) for $g_{\varepsilon}$, as the argument is symmetric for $h_{\varepsilon}$. We have

P{A„ = k}= Q e-' j^p\l - pr-^dp = -^-^I,{k + 1, n - A; + 1), 

where $I_{\varepsilon}(k + 1, n - k + 1)$ is the normalized truncated beta-function (see formula (6.6.2) in [1]). Using formulae (6.6.5) and (6.6.4) of the same reference we obtain

F{An = k} = -^/,(fc,n-fc+l)+ /7f F{B > k+1} < -^+ , P{i? > A;+l} 

n + 1 {n+l)e n + 1 {n + l)e 

where a random variable $B$ has the binomial distribution with parameters $(n, \varepsilon)$. This yields

[n/fcj-l \n/k\-l 

k ^ nAn = rk} < k y: (;r-i + t^ttiv^^^ ^ + 

r=l r=l \ ^ J 

< 1 + 7 -— y F{B>rk + l} 

^ [n/k\-l n 



(n + l)e ^ ^ ^ 

^ ' r=l j=rfc+l 

1 " 

< 1 + - — -y?p|5 = 7i<2. 



The proof of part (ii) is complete. □ 

Lemma 6.3. Let $(b_n(k))_{n \in \mathbb{N},\,1 \le k \le n}$, $(c_n)_{n \in \mathbb{N}}$ and $(d_n)_{n \in \mathbb{N}}$ be nonnegative arrays. Let $(a_n(k))_{n \in \mathbb{N}_0,\,k \in \mathbb{N}}$ and $(a'_n)_{n \in \mathbb{N}_0}$ be defined recursively via

$$a_0(k) = a_1(k) = \dots = a_{k-1}(k) = 0, \quad k \in \mathbb{N};$$

$$a_n(k) = b_n(k) + \sum_{i=k}^{n-1} p_{n,i}\,a_i(k), \quad k \le n,\ k \in \mathbb{N};$$

and

$$a'_0 = 0, \quad a'_n = d_n + \sum_{i=0}^{n-1} p_{n,i}\,a'_i, \quad n \in \mathbb{N},$$

respectively, where $(p_{n,k})_{0 \le k \le n-1}$ is a probability distribution, for every fixed $n \in \mathbb{N}$. If

$$c_k b_n(k) \le d_n, \quad n \in \mathbb{N},\ k \le n,\ k \in \mathbb{N}, \tag{31}$$

then

$$c_k a_n(k) \le a'_n, \quad n \in \mathbb{N},\ k \le n,\ k \in \mathbb{N}. \tag{32}$$

Proof. We shall prove the lemma by induction on $n$. The base of induction is straightforward. Assume that (32) holds for all positive integers $n \le N$ and $k \le n$. We have to prove (32) for $n = N + 1$ and $k \le N + 1$, $k \in \mathbb{N}$. Assume first that $k \le N$; then, using (31) at the first inequality and the induction assumption at the second,

$$c_k a_{N+1}(k) = c_k b_{N+1}(k) + \sum_{i=k}^{N} p_{N+1,i}\,c_k a_i(k) \le d_{N+1} + \sum_{i=k}^{N} p_{N+1,i}\,c_k a_i(k)$$

$$\le d_{N+1} + \sum_{i=k}^{N} p_{N+1,i}\,a'_i \le d_{N+1} + \sum_{i=0}^{N} p_{N+1,i}\,a'_i = a'_{N+1}.$$

For $k = N + 1$ we have

$$c_{N+1}a_{N+1}(N + 1) = c_{N+1}b_{N+1}(N + 1) \le d_{N+1} \le a'_{N+1}.$$

The proof is complete. □